Towards Adaptive Training of Agent-based Sparring Partners for Fighter Pilots

Abstract

A key requirement for the current generation of artificial decision-makers is that they should adapt well to changes in unexpected situations. This paper addresses the situation in which an AI for aerial dogfighting, with tunable parameters that govern its behavior, must optimize behavior with respect to an objective function that is evaluated and learned through simulations. Bayesian optimization with a Gaussian Process surrogate is used as the method for investigating the objective function. One key benefit is that during optimization, the Gaussian Process learns a global estimate of the true objective function, with predicted outcomes and a statistical measure of confidence in areas that haven't been investigated yet. Having a model of the objective function is important for being able to understand possible outcomes in the decision space; for example, this is crucial for training and providing feedback to human pilots. However, standard Bayesian optimization does not perform consistently or provide an accurate Gaussian Process surrogate function for highly volatile objective functions. We treat these problems by introducing a novel sampling technique called Hybrid Repeat/Multi-point Sampling. This technique gives the AI the ability to learn optimum behaviors in a highly uncertain environment. More importantly, it not only improves the reliability of the optimization, but also creates a better model of the entire objective surface. With this improved model the agent is equipped to more accurately/efficiently predict performance in unexplored scenarios.
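
The abstract does not spell out how Hybrid Repeat/Multi-point Sampling works, so the following is only a minimal sketch of the baseline it builds on: Bayesian optimization with a Gaussian Process surrogate, where each candidate is evaluated several times and the results are averaged to tame a noisy objective. Everything here is an assumption made for illustration: the one-dimensional parameter `theta`, the toy `simulate` function standing in for an engagement simulation, the squared-exponential kernel and its length scale, the expected-improvement acquisition, and the fixed repeat count. It is not the authors' method.

```python
import math
import random

def rbf(a, b, length=0.2, var=1.0):
    """Squared-exponential kernel for scalar inputs (assumed kernel choice)."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, noise, x_new):
    """Zero-mean GP posterior mean and standard deviation at x_new."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))
    k_star = [rbf(x, x_new) for x in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(vi * vi for vi in v), 1e-12)
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement over `best` for a maximization problem."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

def simulate(theta, rng, noise_std=0.3):
    """Hypothetical noisy objective; a stand-in for a real simulation run."""
    return -4.0 * (theta - 0.6) ** 2 + rng.gauss(0.0, noise_std)

def bayes_opt(n_iter=10, repeats=5, noise_std=0.3, seed=0):
    rng = random.Random(seed)
    grid = [i / 50 for i in range(51)]      # candidate values of theta in [0, 1]
    noise = noise_std ** 2 / repeats        # variance of an average of `repeats` runs

    def averaged(theta):
        # Repeat sampling: average several noisy evaluations at the same point.
        return sum(simulate(theta, rng, noise_std) for _ in range(repeats)) / repeats

    xs = [0.0, 0.5, 1.0]                    # initial design
    ys = [averaged(x) for x in xs]
    for _ in range(n_iter):
        best = max(ys)
        scores = []
        for c in grid:
            m, s = gp_posterior(xs, ys, noise, c)
            scores.append(expected_improvement(m, s, best))
        nxt = grid[scores.index(max(scores))]
        xs.append(nxt)
        ys.append(averaged(nxt))
    # Recommend the sampled point with the highest posterior mean.
    means = [gp_posterior(xs, ys, noise, x)[0] for x in xs]
    return xs[means.index(max(means))], xs, ys

best_theta, xs, ys = bayes_opt(seed=0)
print("recommended theta:", best_theta, "after", len(xs), "averaged evaluations")
```

Averaging `repeats` runs divides the observation-noise variance passed to the surrogate by `repeats`, which is why `noise` is set to `noise_std ** 2 / repeats`. The recommendation is restricted to sampled points because a zero-mean GP reverts to zero in unexplored regions, which would otherwise look attractive for this toy objective.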